Abstract

In this paper, we develop an approach for the exact determination of the minimum sample
size for the estimation of a Poisson parameter with prescribed margin of error and confidence
level. The exact computation is made possible by reducing infinitely many evaluations of
coverage probability to finitely many evaluations. Such reduction is based on our discovery that
the minimum of coverage probability with respect to a Poisson parameter bounded in an interval
is attained at a discrete set of finitely many values.



1 Introduction



The estimation of a Poisson parameter finds numerous applications in various fields of sciences 
and engineering [3]. The problem is formulated as follows. 



Let $X$ be a Poisson random variable defined in a probability space $(\Omega, \mathscr{F}, \Pr)$ such that
$\Pr\{X = k\} = \frac{\lambda^k e^{-\lambda}}{k!}, \; k = 0, 1, \cdots$, where $\lambda > 0$ is referred to as a Poisson parameter. It is a
frequent problem to estimate $\lambda$ based on $n$ identical and independent samples $X_1, \cdots, X_n$ of $X$.
An estimate of $\lambda$ is conventionally taken as $\hat{\lambda}_n = \frac{\sum_{i=1}^n X_i}{n}$. The nice property of such an estimate is
that it is the maximum likelihood estimate and possesses minimum variance among all unbiased estimates.

A crucial question in the estimation is as follows: 

Given the knowledge that $\lambda$ belongs to the interval $[a, b]$, what is the minimum sample size $n$ that
guarantees that the difference between $\hat{\lambda}_n$ and $\lambda$ is bounded within some prescribed margin of error
with a confidence level higher than a prescribed value?

The main contribution of this paper is to provide an exact answer to this important question. The
paper is organized as follows. In Section 2, the techniques for computing the minimum sample
size are developed with the margin of error taken as a bound of absolute error. In Section 3, we
derive the corresponding sample size method by using a relative error bound as the margin of error. In
Section 4, we develop techniques for computing the minimum sample size with a mixed error criterion.
Section 5 is the conclusion. The proofs are given in the Appendices.

Throughout this paper, we shall use the following notations. The set of integers is denoted
by $\mathbb{Z}$. The ceiling function and floor function are denoted respectively by $\lceil . \rceil$ and $\lfloor . \rfloor$ (i.e., $\lceil x \rceil$
represents the smallest integer no less than $x$; $\lfloor x \rfloor$ represents the largest integer no greater than
$x$). The multivariate function $S(n, k, l, \lambda)$ means

$$S(n, k, l, \lambda) = \sum_{i=k}^{l} \frac{(n\lambda)^i e^{-n\lambda}}{i!}.$$

The one-sided limit as $\eta$ decreases to $0$ is denoted as $\lim_{\eta \downarrow 0}$. The other notations will be made clear as we proceed.

2 Control of Absolute Error 

Let $\varepsilon \in (0, 1)$ be the margin of absolute error and $\delta \in (0, 1)$ be the confidence parameter. In many
applications, it is desirable to find the minimum sample size $n$ such that

$$\Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \} > 1 - \delta$$

for any $\lambda \in [a, b]$. Here $\Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \}$ is referred to as the coverage probability. The interval
$[a, b]$ is introduced to take into account the knowledge of $\lambda$. The exact determination of the minimum
sample size is readily tractable with modern computational power by taking advantage of the
behavior of the coverage probability characterized by Theorem 1 as follows.

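Theorem 1 Let $0 < \varepsilon < 1$ and $0 < a < b$. Let $X_1, \cdots, X_n$ be identical and independent Poisson
random variables with mean $\lambda \in [a, b]$. Let $\hat{\lambda}_n = \frac{\sum_{i=1}^n X_i}{n}$. Then, the minimum of $\Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \}$
with respect to $\lambda \in [a, b]$ is achieved at the finite set
$\{a, b\} \cup \{ \frac{\ell}{n} + \varepsilon \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n} - \varepsilon \in (a, b) : \ell \in \mathbb{Z} \}$,
which has less than $2n(b - a) + 4$ elements.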

See Appendix A for a proof. The application of Theorem 1 in the computation of the minimum
sample size is straightforward. For a fixed sample size $n$, since the minimum of the coverage probability with
respect to $\lambda \in [a, b]$ is attained at a finite set, it can be determined by a computer whether the sample size $n$
is large enough to ensure $\Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \} > 1 - \delta$ for any $\lambda \in [a, b]$. Starting from $n = 2$, one
can find the minimum sample size by gradually incrementing $n$ and checking whether $n$ is large
enough.
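The search just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the author's implementation: the function names (`poisson_sum`, `coverage_abs`, `critical_points_abs`, `worst_coverage_abs`, `min_sample_size_abs`) and the parameters `n_start` and `n_max` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is used so that the floor and ceiling at the critical points are not disturbed by rounding, and the Poisson probabilities themselves are evaluated in ordinary floating point. The count bounds follow the expressions $\max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1)$ and $\lceil n(\lambda + \varepsilon) \rceil - 1$ used in Appendix A.

```python
from fractions import Fraction
from math import ceil, exp, floor, lgamma, log


def poisson_sum(mu, g, h):
    """Sum of Poisson(mu) probabilities over the integers g..h (empty if g > h)."""
    if mu == 0.0:
        return 1.0 if g <= 0 <= h else 0.0
    return sum(exp(k * log(mu) - mu - lgamma(k + 1)) for k in range(g, h + 1))


def coverage_abs(n, lam, eps):
    """Pr{|K/n - lam| < eps} with K ~ Poisson(n*lam); lam and eps are Fractions."""
    g = max(0, floor(n * (lam - eps)) + 1)  # smallest count inside the margin
    h = ceil(n * (lam + eps)) - 1           # largest count inside the margin
    return poisson_sum(float(n * lam), g, h)


def critical_points_abs(n, a, b, eps):
    """The finite set of Theorem 1, in ascending order (all values are Fractions)."""
    pts = {a, b}
    for s in (1, -1):
        # l/n + s*eps lies in (a, b)  <=>  n*(a - s*eps) < l < n*(b - s*eps)
        lo = floor(n * (a - s * eps)) + 1
        hi = ceil(n * (b - s * eps)) - 1
        pts.update(Fraction(l, n) + s * eps for l in range(lo, hi + 1))
    return sorted(pts)


def worst_coverage_abs(n, a, b, eps):
    """Minimum coverage over [a, b], evaluated only at the critical points."""
    return min(coverage_abs(n, lam, eps) for lam in critical_points_abs(n, a, b, eps))


def min_sample_size_abs(a, b, eps, delta, n_start=1, n_max=100_000):
    """Smallest n in [n_start, n_max] whose worst-case coverage exceeds 1 - delta."""
    a, b, eps, delta = (Fraction(v) for v in (a, b, eps, delta))
    for n in range(n_start, n_max + 1):
        if worst_coverage_abs(n, a, b, eps) > 1 - delta:
            return n
    raise ValueError("no sample size up to n_max meets the requirement")
```

For example, `min_sample_size_abs("1", "2", "0.5", "0.1")` searches for the smallest $n$ with $a = 1$, $b = 2$, $\varepsilon = 0.5$ and $\delta = 0.1$. Passing the inputs as strings or `Fraction` objects avoids binary rounding of decimal values; the starting value of the search is left as a parameter.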





3 Control of Relative Error 



Let $\varepsilon \in (0, 1)$ be the margin of relative error and $\delta \in (0, 1)$ be the confidence parameter. It is
interesting to determine the minimum sample size $n$ so that

$$\Pr\left\{ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon \right\} > 1 - \delta$$

for any $\lambda \in [a, b]$. As has been pointed out in Section 2, the essential step is to reduce infinitely
many evaluations of the coverage probability $\Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \lambda \}$ to finitely many evaluations. Such
reduction can be accomplished by making use of Theorem 2 as follows.

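Theorem 2 Let $0 < \varepsilon < 1$ and $0 < a < b$. Let $X_1, \cdots, X_n$ be identical and independent Poisson
random variables with mean $\lambda \in [a, b]$. Let $\hat{\lambda}_n = \frac{\sum_{i=1}^n X_i}{n}$. Then, the minimum of $\Pr\left\{ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon \right\}$
with respect to $\lambda \in [a, b]$ is achieved at the finite set
$\{a, b\} \cup \{ \frac{\ell}{n(1 - \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n(1 + \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \}$,
which has less than $2n(b - a) + 4$ elements.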

See Appendix B for a proof. 
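As a hedged illustration of how Theorem 2 might be used, the following Python sketch enumerates the critical points for the relative error criterion and evaluates the coverage probability at each of them. The function names are hypothetical, $a > 0$ is assumed so that the Poisson mean is positive, and exact rationals are used for the floor and ceiling computations, while the Poisson probabilities are evaluated in floating point.

```python
from fractions import Fraction
from math import ceil, exp, floor, lgamma, log


def coverage_rel(n, lam, eps):
    """Pr{|K/n - lam| < eps*lam} with K ~ Poisson(n*lam); requires lam > 0."""
    g = floor(n * lam * (1 - eps)) + 1  # smallest count inside the margin
    h = ceil(n * lam * (1 + eps)) - 1   # largest count inside the margin
    mu = float(n * lam)
    return sum(exp(k * log(mu) - mu - lgamma(k + 1)) for k in range(g, h + 1))


def critical_points_rel(n, a, b, eps):
    """The finite set of Theorem 2, in ascending order (all values are Fractions)."""
    pts = {a, b}
    for s in (1, -1):
        scale = n * (1 + s * eps)
        # l/scale lies in (a, b)  <=>  a*scale < l < b*scale
        lo = floor(a * scale) + 1
        hi = ceil(b * scale) - 1
        pts.update(Fraction(l) / scale for l in range(lo, hi + 1))
    return sorted(pts)


def worst_coverage_rel(n, a, b, eps):
    """Minimum coverage over [a, b], evaluated only at the critical points."""
    return min(coverage_rel(n, lam, eps) for lam in critical_points_rel(n, a, b, eps))
```

A sample size $n$ is adequate for the relative criterion when `worst_coverage_rel(n, a, b, eps)` exceeds $1 - \delta$, so the incremental search of Section 2 carries over unchanged.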



4 Control of Absolute Error or Relative Error 



Let $\varepsilon_a \in (0, 1)$ and $\varepsilon_r \in (0, 1)$ be respectively the margins of absolute error and relative error.
Let $\delta \in (0, 1)$ be the confidence parameter. In many situations, it is desirable to find the smallest
sample size $n$ such that

$$\Pr\left\{ |\hat{\lambda}_n - \lambda| < \varepsilon_a \ \text{ or } \ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon_r \right\} > 1 - \delta \tag{1}$$

for any $\lambda \in [a, b]$. To make it possible to compute exactly the minimum sample size associated
with (1), we have Theorem 3 as follows.

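Theorem 3 Let $0 < \varepsilon_a < 1$, $0 < \varepsilon_r < 1$ and $0 < a < \frac{\varepsilon_a}{\varepsilon_r} < b$. Let $X_1, \cdots, X_n$ be identical
and independent Poisson random variables with mean $\lambda \in [a, b]$. Let $\hat{\lambda}_n = \frac{\sum_{i=1}^n X_i}{n}$. Then, the
minimum of $\Pr\left\{ |\hat{\lambda}_n - \lambda| < \varepsilon_a \ \text{or} \ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon_r \right\}$ with respect to $\lambda \in [a, b]$ is achieved at the finite
set
$\{ a, b, \frac{\varepsilon_a}{\varepsilon_r} \} \cup \{ \frac{\ell}{n} + \varepsilon_a \in (a, \frac{\varepsilon_a}{\varepsilon_r}) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n} - \varepsilon_a \in (a, \frac{\varepsilon_a}{\varepsilon_r}) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n(1 + \varepsilon_r)} \in (\frac{\varepsilon_a}{\varepsilon_r}, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n(1 - \varepsilon_r)} \in (\frac{\varepsilon_a}{\varepsilon_r}, b) : \ell \in \mathbb{Z} \}$,
which has less than $2n(b - a) + 7$ elements.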

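Theorem 3 can be shown by applying Theorem 1 and Theorem 2 with the observation that

$$\Pr\left\{ |\hat{\lambda}_n - \lambda| < \varepsilon_a \ \text{ or } \ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon_r \right\} = \begin{cases} \Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon_a \} & \text{for } \lambda \in \left[ a, \frac{\varepsilon_a}{\varepsilon_r} \right], \\ \Pr\left\{ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon_r \right\} & \text{for } \lambda \in \left( \frac{\varepsilon_a}{\varepsilon_r}, b \right]. \end{cases}$$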
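The case split in this observation amounts to using the tolerance $\max(\varepsilon_a, \varepsilon_r \lambda)$, which equals $\varepsilon_a$ for $\lambda \le \varepsilon_a / \varepsilon_r$ and $\varepsilon_r \lambda$ otherwise. A minimal Python sketch of the resulting coverage probability is given below; the function name `coverage_mixed` is hypothetical, $\lambda > 0$ is assumed, and the floor and ceiling are taken on exact rationals.

```python
from fractions import Fraction
from math import ceil, exp, floor, lgamma, log


def coverage_mixed(n, lam, eps_a, eps_r):
    """Pr{|K/n - lam| < eps_a or |K/n - lam| < eps_r*lam}, K ~ Poisson(n*lam).

    lam, eps_a and eps_r are Fractions and lam > 0 is assumed.
    """
    # The union of the two events is |K/n - lam| < max(eps_a, eps_r*lam).
    tol = max(eps_a, eps_r * lam)
    g = max(0, floor(n * (lam - tol)) + 1)  # smallest count inside the margin
    h = ceil(n * (lam + tol)) - 1           # largest count inside the margin
    mu = float(n * lam)
    return sum(exp(k * log(mu) - mu - lgamma(k + 1)) for k in range(g, h + 1))
```

With $\varepsilon_a = 1/2$ and $\varepsilon_r = 1/4$ the switch occurs at $\lambda = 2$: below it the function reproduces the absolute error coverage, and above it the relative error coverage.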



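By virtue of Chernoff bounds, it can be shown that, for any $\varepsilon \in (0, 1)$,

$$\Pr\{ \hat{\lambda}_n \le (1 - \varepsilon) \lambda \} \le \left[ \frac{e^{-\varepsilon}}{(1 - \varepsilon)^{1 - \varepsilon}} \right]^{n \lambda} \le \exp\left( - \frac{\lambda n \varepsilon^2}{2} \right),$$

$$\Pr\{ \hat{\lambda}_n \ge (1 + \varepsilon) \lambda \} \le \left[ \frac{e^{\varepsilon}}{(1 + \varepsilon)^{1 + \varepsilon}} \right]^{n \lambda} \le \exp\left( - (2 \ln 2 - 1) \lambda n \varepsilon^2 \right).$$

Since $2 \ln 2 - 1 < \frac{1}{2}$, both tails are bounded by $\exp( - (2 \ln 2 - 1) \lambda n \varepsilon^2 )$. As a result, $\Pr\{ |\hat{\lambda}_n - \lambda| \ge \varepsilon \lambda \} < \delta$ if

$$\lambda > \frac{\ln \frac{2}{\delta}}{(2 \ln 2 - 1) n \varepsilon^2}.$$

Therefore, taking $\varepsilon = \varepsilon_r$, to check whether (1) is satisfied for any $\lambda \in [a, b]$, it suffices to check (1) for

$$a \le \lambda \le \min\left\{ b, \; \frac{\ln \frac{2}{\delta}}{(2 \ln 2 - 1) n \varepsilon_r^2} \right\}.$$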
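To make the truncation concrete, the following Python sketch evaluates the cutoff for $\lambda$ and the two-sided tail bound it is derived from. The function names `lambda_cutoff` and `chernoff_two_sided` are hypothetical, and the two-sided bound assumes that the weaker constant $2 \ln 2 - 1$ is applied to both tails.

```python
from math import exp, log

# Constant appearing in the upper-tail bound; it is smaller than 1/2,
# so it can be used for both tails at once.
C = 2 * log(2) - 1


def chernoff_two_sided(n, lam, eps):
    """Upper bound on Pr{|K/n - lam| >= eps*lam} for K ~ Poisson(n*lam)."""
    return 2 * exp(-C * n * lam * eps ** 2)


def lambda_cutoff(n, eps, delta):
    """Value of lam above which the two-sided bound is below delta."""
    return log(2 / delta) / (C * n * eps ** 2)
```

For a given $n$, only values of $\lambda$ in $[a, \min\{b, \texttt{lambda\_cutoff}(n, \varepsilon, \delta)\}]$ then need to be examined, which shortens the list of critical points when $b$ is large.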

Finally, we would like to point out that similar characteristics of the coverage probability can 
be shown for the problem of estimating binomial parameter or the proportion of finite population, 
which allows for the exact computation of minimum sample size. For details, see our recent papers 

HE]. 

5 Conclusion 

We have developed an exact method for the computation of the minimum sample size for the estimation
of Poisson parameters, which only requires finitely many evaluations of the coverage probability.
Our sample size method permits rigorous control of statistical sampling error.



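A Proof of Theorem 1

Define $K = \sum_{i=1}^n X_i$ and

$$C(\lambda) = \Pr\{ |\hat{\lambda}_n - \lambda| < \varepsilon \} = \Pr\left\{ \left| \frac{K}{n} - \lambda \right| < \varepsilon \right\} = \Pr\{ g(\lambda) \le K \le h(\lambda) \}$$

where

$$g(\lambda) = \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1), \qquad h(\lambda) = \lceil n(\lambda + \varepsilon) \rceil - 1.$$

It should be noted that $C(\lambda)$, $g(\lambda)$ and $h(\lambda)$ are actually multivariate functions of $\lambda$, $\varepsilon$ and $n$.
For simplicity of notations, we drop the arguments $n$ and $\varepsilon$ throughout the proof of Theorem 1.
We need some preliminary results.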

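Lemma 1 Let $\lambda_\ell = \frac{\ell}{n} - \varepsilon$ where $\ell \in \mathbb{Z}$. Then, $h(\lambda) = h(\lambda_{\ell+1}) = \ell$ for any $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$.

Proof. For $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$, we have $0 < n(\lambda - \lambda_\ell) < 1$ and

$$\begin{aligned}
h(\lambda) &= \lceil n(\lambda + \varepsilon) \rceil - 1 \\
&= \lceil n(\lambda_\ell + \varepsilon + \lambda - \lambda_\ell) \rceil - 1 \\
&= \left\lceil n \left( \frac{\ell}{n} - \varepsilon + \varepsilon + \lambda - \lambda_\ell \right) \right\rceil - 1 \\
&= \ell - 1 + \lceil n(\lambda - \lambda_\ell) \rceil \\
&= \ell = \left\lceil n \left( \frac{\ell + 1}{n} - \varepsilon + \varepsilon \right) \right\rceil - 1 = h(\lambda_{\ell+1}).
\end{aligned}$$

□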



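Lemma 2 Let $\lambda_\ell = \frac{\ell}{n} + \varepsilon$ where $\ell \in \mathbb{Z}$. Then, $g(\lambda) = g(\lambda_\ell) = \max\{0, \ell + 1\}$ for any $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$.

Proof. For $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$, we have $-1 < n(\lambda - \lambda_{\ell+1}) < 0$ and

$$\begin{aligned}
g(\lambda) &= \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1) \\
&= \max(0, \lfloor n(\lambda_{\ell+1} - \varepsilon + \lambda - \lambda_{\ell+1}) \rfloor + 1) \\
&= \max\left( 0, \left\lfloor n \left( \frac{\ell + 1}{n} + \varepsilon - \varepsilon \right) \right\rfloor + \lfloor n(\lambda - \lambda_{\ell+1}) \rfloor + 1 \right) \\
&= \max(0, \ell + 1 - 1 + 1) \\
&= \max\{0, \ell + 1\} \\
&= \max\left( 0, \left\lfloor n \left( \frac{\ell}{n} + \varepsilon - \varepsilon \right) \right\rfloor + 1 \right) = g(\lambda_\ell).
\end{aligned}$$

□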



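Lemma 3 Let $\alpha < \beta$ be two consecutive elements of the ascending arrangement of all distinct
elements of $\{a, b\} \cup \{ \frac{\ell}{n} + \varepsilon \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n} - \varepsilon \in (a, b) : \ell \in \mathbb{Z} \}$. Then, both $g(\lambda)$ and $h(\lambda)$
are constants for any $\lambda \in (\alpha, \beta)$.

Proof. Since $\alpha$ and $\beta$ are two consecutive elements of the ascending arrangement of all distinct
elements of the set, it must be true that there is no integer $\ell$ such that $\alpha < \frac{\ell}{n} + \varepsilon < \beta$ or
$\alpha < \frac{\ell}{n} - \varepsilon < \beta$. It follows that there exist two integers $\ell$ and $\ell'$ such that $(\alpha, \beta) \subseteq \left( \frac{\ell}{n} + \varepsilon, \frac{\ell + 1}{n} + \varepsilon \right)$
and $(\alpha, \beta) \subseteq \left( \frac{\ell'}{n} - \varepsilon, \frac{\ell' + 1}{n} - \varepsilon \right)$. Applying Lemma 1 and Lemma 2, we have $g(\lambda) = g\left( \frac{\ell}{n} + \varepsilon \right)$ and
$h(\lambda) = h\left( \frac{\ell' + 1}{n} - \varepsilon \right)$ for any $\lambda \in (\alpha, \beta)$.

□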



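Lemma 4 For any $\lambda > 0$, $\lim_{\eta \downarrow 0} C(\lambda + \eta) \ge C(\lambda)$ and $\lim_{\eta \downarrow 0} C(\lambda - \eta) \ge C(\lambda)$.

Proof. Observing that $h(\lambda + \eta) \ge h(\lambda)$ for any $\eta > 0$ and that

$$\begin{aligned}
g(\lambda + \eta) &= \max(0, \lfloor n(\lambda + \eta - \varepsilon) \rfloor + 1) \\
&= \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1 + \lfloor n(\lambda - \varepsilon) - \lfloor n(\lambda - \varepsilon) \rfloor + n\eta \rfloor) \\
&= \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1) = g(\lambda)
\end{aligned}$$

for $0 < \eta < \frac{1 + \lfloor n(\lambda - \varepsilon) \rfloor - n(\lambda - \varepsilon)}{n}$, we have

$$S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta) \ge S(n, g(\lambda), h(\lambda), \lambda + \eta) \tag{2}$$

for $0 < \eta < \frac{1 + \lfloor n(\lambda - \varepsilon) \rfloor - n(\lambda - \varepsilon)}{n}$. Since

$$h(\lambda + \eta) = \lceil n(\lambda + \eta + \varepsilon) \rceil - 1 = \lceil n(\lambda + \varepsilon) \rceil - 1 + \lceil n(\lambda + \varepsilon) - \lceil n(\lambda + \varepsilon) \rceil + n\eta \rceil,$$

we have

$$h(\lambda + \eta) = \begin{cases} \lceil n(\lambda + \varepsilon) \rceil & \text{for } n(\lambda + \varepsilon) = \lceil n(\lambda + \varepsilon) \rceil \text{ and } 0 < \eta < \frac{1}{n}, \\ \lceil n(\lambda + \varepsilon) \rceil - 1 & \text{for } n(\lambda + \varepsilon) \ne \lceil n(\lambda + \varepsilon) \rceil \text{ and } 0 < \eta < \frac{\lceil n(\lambda + \varepsilon) \rceil - n(\lambda + \varepsilon)}{n}. \end{cases}$$

It follows that both $g(\lambda + \eta)$ and $h(\lambda + \eta)$ are independent of $\eta$ if $\eta > 0$ is small enough. Since
$S(n, g, h, \lambda + \eta)$ is continuous with respect to $\eta$ for fixed $g$ and $h$, we have that
$\lim_{\eta \downarrow 0} S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta)$ exists. As a result,

$$\begin{aligned}
\lim_{\eta \downarrow 0} C(\lambda + \eta) &= \lim_{\eta \downarrow 0} S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta) \\
&\ge \lim_{\eta \downarrow 0} S(n, g(\lambda), h(\lambda), \lambda + \eta) = S(n, g(\lambda), h(\lambda), \lambda) = C(\lambda),
\end{aligned}$$

where the inequality follows from (2).

Observing that $g(\lambda - \eta) \le g(\lambda)$ for any $\eta > 0$ and that

$$\begin{aligned}
h(\lambda - \eta) &= \lceil n(\lambda - \eta + \varepsilon) \rceil - 1 \\
&= \lceil n(\lambda + \varepsilon) \rceil - 1 + \lceil n(\lambda + \varepsilon) - \lceil n(\lambda + \varepsilon) \rceil - n\eta \rceil \\
&= \lceil n(\lambda + \varepsilon) \rceil - 1 = h(\lambda)
\end{aligned}$$

for $0 < \eta < \frac{1 + n(\lambda + \varepsilon) - \lceil n(\lambda + \varepsilon) \rceil}{n}$, we have

$$S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta) \ge S(n, g(\lambda), h(\lambda), \lambda - \eta) \tag{3}$$

for $0 < \eta < \min\left\{ \lambda, \frac{1 + n(\lambda + \varepsilon) - \lceil n(\lambda + \varepsilon) \rceil}{n} \right\}$. Since

$$\begin{aligned}
g(\lambda - \eta) &= \max(0, \lfloor n(\lambda - \eta - \varepsilon) \rfloor + 1) \\
&= \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1 + \lfloor n(\lambda - \varepsilon) - \lfloor n(\lambda - \varepsilon) \rfloor - n\eta \rfloor),
\end{aligned}$$

we have

$$g(\lambda - \eta) = \begin{cases} \max(0, \lfloor n(\lambda - \varepsilon) \rfloor) & \text{for } n(\lambda - \varepsilon) = \lfloor n(\lambda - \varepsilon) \rfloor \text{ and } 0 < \eta < \frac{1}{n}, \\ \max(0, \lfloor n(\lambda - \varepsilon) \rfloor + 1) & \text{for } n(\lambda - \varepsilon) \ne \lfloor n(\lambda - \varepsilon) \rfloor \text{ and } 0 < \eta < \frac{n(\lambda - \varepsilon) - \lfloor n(\lambda - \varepsilon) \rfloor}{n}. \end{cases}$$

It follows that both $g(\lambda - \eta)$ and $h(\lambda - \eta)$ are independent of $\eta$ if $\eta > 0$ is small enough. Since
$S(n, g, h, \lambda - \eta)$ is continuous with respect to $\eta$ for fixed $g$ and $h$, we have that
$\lim_{\eta \downarrow 0} S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta)$ exists. Hence,

$$\begin{aligned}
\lim_{\eta \downarrow 0} C(\lambda - \eta) &= \lim_{\eta \downarrow 0} S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta) \\
&\ge \lim_{\eta \downarrow 0} S(n, g(\lambda), h(\lambda), \lambda - \eta) = S(n, g(\lambda), h(\lambda), \lambda) = C(\lambda),
\end{aligned}$$

where the inequality follows from (3).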

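□

Lemma 5 Let $\alpha < \beta$ be two consecutive elements of the ascending arrangement of all distinct
elements of $\{a, b\} \cup \{ \frac{\ell}{n} + \varepsilon \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n} - \varepsilon \in (a, b) : \ell \in \mathbb{Z} \}$. Then, $C(\lambda) \ge \min\{ C(\alpha), C(\beta) \}$
for any $\lambda \in (\alpha, \beta)$.

Proof. By Lemma 3, both $g(\lambda)$ and $h(\lambda)$ are constants for any $\lambda \in (\alpha, \beta)$. Hence, we can drop
the argument and write $g(\lambda) = g$, $h(\lambda) = h$ and $C(\lambda) = S(n, g, h, \lambda)$.

For $\lambda \in (\alpha, \beta)$, define the interval $[\alpha + \eta, \beta - \eta]$ with $0 < \eta < \min\{ \lambda - \alpha, \beta - \lambda \}$. Then,
$C(\lambda) \ge \min_{\mu \in [\alpha + \eta, \beta - \eta]} C(\mu)$. Note that $\frac{\partial S(n, 0, l, \lambda)}{\partial \lambda} = - \frac{n (n\lambda)^l e^{-n\lambda}}{l!}$ and thus, for $g > 0$,

$$\begin{aligned}
\frac{\partial S(n, g, h, \lambda)}{\partial \lambda} &= \frac{\partial S(n, 0, h, \lambda)}{\partial \lambda} - \frac{\partial S(n, 0, g - 1, \lambda)}{\partial \lambda} \\
&= n \left[ \frac{(n\lambda)^{g-1} e^{-n\lambda}}{(g - 1)!} - \frac{(n\lambda)^{h} e^{-n\lambda}}{h!} \right] \\
&= \frac{n (n\lambda)^{g-1} e^{-n\lambda}}{h!} \left[ \frac{h!}{(g - 1)!} - (n\lambda)^{h - g + 1} \right] > 0
\end{aligned}$$

if $\lambda < \frac{1}{n} \left[ \frac{h!}{(g - 1)!} \right]^{\frac{1}{h - g + 1}}$, while for $g = 0$ the derivative is negative. From such investigation of the derivative of $S(n, g, h, \lambda)$ with respect to
$\lambda$, we can see that, for $0 < \eta < \min\{ \lambda - \alpha, \beta - \lambda \}$, one of the following three cases must be
true: (1) $C(\mu)$ decreases monotonically for $\mu \in [\alpha + \eta, \beta - \eta]$; (2) $C(\mu)$ increases monotonically
for $\mu \in [\alpha + \eta, \beta - \eta]$; (3) there exists a number $\theta \in (\alpha + \eta, \beta - \eta)$ such that $C(\mu)$ increases
monotonically for $\mu \in [\alpha + \eta, \theta]$ and decreases monotonically for $\mu \in (\theta, \beta - \eta]$. It follows that

$$C(\lambda) \ge \min_{\mu \in [\alpha + \eta, \beta - \eta]} C(\mu) = \min\{ C(\alpha + \eta), C(\beta - \eta) \}$$

for $0 < \eta < \min\{ \lambda - \alpha, \beta - \lambda \}$. By Lemma 4, both $\lim_{\eta \downarrow 0} C(\alpha + \eta)$ and $\lim_{\eta \downarrow 0} C(\beta - \eta)$ exist
and

$$\begin{aligned}
C(\lambda) &\ge \lim_{\eta \downarrow 0} \min\{ C(\alpha + \eta), C(\beta - \eta) \} \\
&= \min\left\{ \lim_{\eta \downarrow 0} C(\alpha + \eta), \lim_{\eta \downarrow 0} C(\beta - \eta) \right\} \ge \min\{ C(\alpha), C(\beta) \}
\end{aligned}$$

for any $\lambda \in (\alpha, \beta)$.

□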



Finally, to show Theorem 1, note that the statement about the coverage probability follows
immediately from Lemma 5. The number of elements of the finite set can be calculated by using
the property of the ceiling and floor functions.
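One way the element count might be carried out is sketched below; this is a reconstruction from the definitions in Theorem 1, not a derivation taken from the original text. The number of integers in an open interval $(x, y)$ is $\lceil y \rceil - \lfloor x \rfloor - 1$, and $\lceil y \rceil < y + 1$, $\lfloor x \rfloor > x - 1$.

```latex
\begin{align*}
\#\Bigl\{ \ell \in \mathbb{Z} : a < \tfrac{\ell}{n} + \varepsilon < b \Bigr\}
  &= \#\bigl\{ \ell \in \mathbb{Z} : n(a - \varepsilon) < \ell < n(b - \varepsilon) \bigr\} \\
  &= \lceil n(b - \varepsilon) \rceil - \lfloor n(a - \varepsilon) \rfloor - 1 \\
  &< \bigl( n(b - \varepsilon) + 1 \bigr) - \bigl( n(a - \varepsilon) - 1 \bigr) - 1
   = n(b - a) + 1 .
\end{align*}
% The same bound holds with \varepsilon replaced by -\varepsilon, so together
% with the two endpoints a and b the finite set has fewer than
% 2 + 2\,(n(b - a) + 1) = 2n(b - a) + 4 elements.
```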



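B Proof of Theorem 2

Define $K = \sum_{i=1}^n X_i$ and

$$C(\lambda) = \Pr\left\{ \left| \frac{\hat{\lambda}_n - \lambda}{\lambda} \right| < \varepsilon \right\} = \Pr\left\{ \left| \frac{K}{n} - \lambda \right| < \varepsilon \lambda \right\} = \Pr\{ g(\lambda) \le K \le h(\lambda) \}$$

where

$$g(\lambda) = \lfloor n\lambda(1 - \varepsilon) \rfloor + 1, \qquad h(\lambda) = \lceil n\lambda(1 + \varepsilon) \rceil - 1.$$

It should be noted that $C(\lambda)$, $g(\lambda)$ and $h(\lambda)$ are actually multivariate functions of $\lambda$, $\varepsilon$ and $n$.
For simplicity of notations, we drop the arguments $n$ and $\varepsilon$ throughout the proof of Theorem 2.
We need some preliminary results.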



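Lemma 6 Let $\lambda_\ell = \frac{\ell}{n(1 + \varepsilon)}$ where $\ell \in \mathbb{Z}$. Then, $h(\lambda) = h(\lambda_{\ell+1}) = \ell$ for any $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$.

Proof. For $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$, we have $0 < n(1 + \varepsilon)(\lambda - \lambda_\ell) < 1$ and

$$\begin{aligned}
h(\lambda) &= \lceil n\lambda(1 + \varepsilon) \rceil - 1 \\
&= \lceil n [ \lambda_\ell (1 + \varepsilon) + (1 + \varepsilon)(\lambda - \lambda_\ell) ] \rceil - 1 \\
&= \left\lceil n \left[ \frac{\ell}{n(1 + \varepsilon)} \times (1 + \varepsilon) + (1 + \varepsilon)(\lambda - \lambda_\ell) \right] \right\rceil - 1 \\
&= \ell - 1 + \lceil n(1 + \varepsilon)(\lambda - \lambda_\ell) \rceil \\
&= \ell = \left\lceil n \times \frac{\ell + 1}{n(1 + \varepsilon)} \times (1 + \varepsilon) \right\rceil - 1 = h(\lambda_{\ell+1}).
\end{aligned}$$

□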



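Lemma 7 Let $\lambda_\ell = \frac{\ell}{n(1 - \varepsilon)}$ where $\ell \in \mathbb{Z}$. Then, $g(\lambda) = g(\lambda_\ell) = \ell + 1$ for any $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$.

Proof. For $\lambda \in (\lambda_\ell, \lambda_{\ell+1})$, we have $-1 < n(1 - \varepsilon)(\lambda - \lambda_{\ell+1}) < 0$ and

$$\begin{aligned}
g(\lambda) &= \lfloor n\lambda(1 - \varepsilon) \rfloor + 1 \\
&= \lfloor n [ \lambda_{\ell+1} (1 - \varepsilon) + (1 - \varepsilon)(\lambda - \lambda_{\ell+1}) ] \rfloor + 1 \\
&= \left\lfloor n \times \frac{\ell + 1}{n(1 - \varepsilon)} \times (1 - \varepsilon) \right\rfloor + \lfloor n(1 - \varepsilon)(\lambda - \lambda_{\ell+1}) \rfloor + 1 \\
&= \left\lfloor n \times \frac{\ell + 1}{n(1 - \varepsilon)} \times (1 - \varepsilon) \right\rfloor - 1 + 1 \\
&= \ell + 1 \\
&= \left\lfloor n \times \frac{\ell}{n(1 - \varepsilon)} \times (1 - \varepsilon) \right\rfloor + 1 = g(\lambda_\ell).
\end{aligned}$$

□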



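Lemma 8 Let $\alpha < \beta$ be two consecutive elements of the ascending arrangement of all distinct
elements of $\{a, b\} \cup \{ \frac{\ell}{n(1 - \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n(1 + \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \}$. Then, both $g(\lambda)$ and $h(\lambda)$
are constants for any $\lambda \in (\alpha, \beta)$.

Proof. Since $\alpha$ and $\beta$ are two consecutive elements of the ascending arrangement of all distinct
elements of the set, it must be true that there is no integer $\ell$ such that $\alpha < \frac{\ell}{n(1 - \varepsilon)} < \beta$ or
$\alpha < \frac{\ell}{n(1 + \varepsilon)} < \beta$. It follows that there exist two integers $\ell$ and $\ell'$ such that $(\alpha, \beta) \subseteq \left( \frac{\ell}{n(1 - \varepsilon)}, \frac{\ell + 1}{n(1 - \varepsilon)} \right)$
and $(\alpha, \beta) \subseteq \left( \frac{\ell'}{n(1 + \varepsilon)}, \frac{\ell' + 1}{n(1 + \varepsilon)} \right)$. Applying Lemma 6 and Lemma 7, we have $g(\lambda) = g\left( \frac{\ell}{n(1 - \varepsilon)} \right)$ and
$h(\lambda) = h\left( \frac{\ell' + 1}{n(1 + \varepsilon)} \right)$ for any $\lambda \in (\alpha, \beta)$.

□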



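Lemma 9 For any $\lambda > 0$, $\lim_{\eta \downarrow 0} C(\lambda + \eta) \ge C(\lambda)$ and $\lim_{\eta \downarrow 0} C(\lambda - \eta) \ge C(\lambda)$.

Proof. Observing that $h(\lambda + \eta) \ge h(\lambda)$ for any $\eta > 0$ and that

$$\begin{aligned}
g(\lambda + \eta) &= \lfloor n(\lambda + \eta)(1 - \varepsilon) \rfloor + 1 \\
&= \lfloor n\lambda(1 - \varepsilon) \rfloor + 1 + \lfloor n\lambda(1 - \varepsilon) - \lfloor n\lambda(1 - \varepsilon) \rfloor + n\eta(1 - \varepsilon) \rfloor \\
&= \lfloor n\lambda(1 - \varepsilon) \rfloor + 1 = g(\lambda)
\end{aligned}$$

for $0 < \eta < \frac{1 + \lfloor n\lambda(1 - \varepsilon) \rfloor - n\lambda(1 - \varepsilon)}{n(1 - \varepsilon)}$, we have

$$S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta) \ge S(n, g(\lambda), h(\lambda), \lambda + \eta) \tag{4}$$

for $0 < \eta < \frac{1 + \lfloor n\lambda(1 - \varepsilon) \rfloor - n\lambda(1 - \varepsilon)}{n(1 - \varepsilon)}$. Since

$$\begin{aligned}
h(\lambda + \eta) &= \lceil n(\lambda + \eta)(1 + \varepsilon) \rceil - 1 \\
&= \lceil n\lambda(1 + \varepsilon) \rceil - 1 + \lceil n\lambda(1 + \varepsilon) - \lceil n\lambda(1 + \varepsilon) \rceil + n\eta(1 + \varepsilon) \rceil,
\end{aligned}$$

we have

$$h(\lambda + \eta) = \begin{cases} \lceil n\lambda(1 + \varepsilon) \rceil & \text{for } n\lambda(1 + \varepsilon) = \lceil n\lambda(1 + \varepsilon) \rceil \text{ and } 0 < \eta < \frac{1}{n(1 + \varepsilon)}, \\ \lceil n\lambda(1 + \varepsilon) \rceil - 1 & \text{for } n\lambda(1 + \varepsilon) \ne \lceil n\lambda(1 + \varepsilon) \rceil \text{ and } 0 < \eta < \frac{\lceil n\lambda(1 + \varepsilon) \rceil - n\lambda(1 + \varepsilon)}{n(1 + \varepsilon)}. \end{cases}$$

It follows that both $g(\lambda + \eta)$ and $h(\lambda + \eta)$ are independent of $\eta$ if $\eta > 0$ is small enough. Since
$S(n, g, h, \lambda + \eta)$ is continuous with respect to $\eta$ for fixed $g$ and $h$, we have that
$\lim_{\eta \downarrow 0} S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta)$ exists. As a result,

$$\begin{aligned}
\lim_{\eta \downarrow 0} C(\lambda + \eta) &= \lim_{\eta \downarrow 0} S(n, g(\lambda + \eta), h(\lambda + \eta), \lambda + \eta) \\
&\ge \lim_{\eta \downarrow 0} S(n, g(\lambda), h(\lambda), \lambda + \eta) = S(n, g(\lambda), h(\lambda), \lambda) = C(\lambda),
\end{aligned}$$

where the inequality follows from (4).

Observing that $g(\lambda - \eta) \le g(\lambda)$ for any $\eta > 0$ and that

$$\begin{aligned}
h(\lambda - \eta) &= \lceil n(\lambda - \eta)(1 + \varepsilon) \rceil - 1 \\
&= \lceil n\lambda(1 + \varepsilon) \rceil - 1 + \lceil n\lambda(1 + \varepsilon) - \lceil n\lambda(1 + \varepsilon) \rceil - n\eta(1 + \varepsilon) \rceil \\
&= \lceil n\lambda(1 + \varepsilon) \rceil - 1 = h(\lambda)
\end{aligned}$$

for $0 < \eta < \frac{1 + n\lambda(1 + \varepsilon) - \lceil n\lambda(1 + \varepsilon) \rceil}{n(1 + \varepsilon)}$, we have

$$S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta) \ge S(n, g(\lambda), h(\lambda), \lambda - \eta) \tag{5}$$

for $0 < \eta < \min\left\{ \lambda, \frac{1 + n\lambda(1 + \varepsilon) - \lceil n\lambda(1 + \varepsilon) \rceil}{n(1 + \varepsilon)} \right\}$. Since

$$\begin{aligned}
g(\lambda - \eta) &= \lfloor n(\lambda - \eta)(1 - \varepsilon) \rfloor + 1 \\
&= \lfloor n\lambda(1 - \varepsilon) \rfloor + 1 + \lfloor n\lambda(1 - \varepsilon) - \lfloor n\lambda(1 - \varepsilon) \rfloor - n\eta(1 - \varepsilon) \rfloor,
\end{aligned}$$

we have

$$g(\lambda - \eta) = \begin{cases} \lfloor n\lambda(1 - \varepsilon) \rfloor & \text{for } n\lambda(1 - \varepsilon) = \lfloor n\lambda(1 - \varepsilon) \rfloor \text{ and } 0 < \eta < \frac{1}{n(1 - \varepsilon)}, \\ \lfloor n\lambda(1 - \varepsilon) \rfloor + 1 & \text{for } n\lambda(1 - \varepsilon) \ne \lfloor n\lambda(1 - \varepsilon) \rfloor \text{ and } 0 < \eta < \frac{n\lambda(1 - \varepsilon) - \lfloor n\lambda(1 - \varepsilon) \rfloor}{n(1 - \varepsilon)}. \end{cases}$$

It follows that both $g(\lambda - \eta)$ and $h(\lambda - \eta)$ are independent of $\eta$ if $\eta > 0$ is small enough. Since
$S(n, g, h, \lambda - \eta)$ is continuous with respect to $\eta$ for fixed $g$ and $h$, we have that
$\lim_{\eta \downarrow 0} S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta)$ exists. Hence,

$$\begin{aligned}
\lim_{\eta \downarrow 0} C(\lambda - \eta) &= \lim_{\eta \downarrow 0} S(n, g(\lambda - \eta), h(\lambda - \eta), \lambda - \eta) \\
&\ge \lim_{\eta \downarrow 0} S(n, g(\lambda), h(\lambda), \lambda - \eta) = S(n, g(\lambda), h(\lambda), \lambda) = C(\lambda),
\end{aligned}$$

where the inequality follows from (5).

□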

By an argument similar to that of Lemma 5, we have

Lemma 10 Let $\alpha < \beta$ be two consecutive elements of the ascending arrangement of all distinct
elements of $\{a, b\} \cup \{ \frac{\ell}{n(1 - \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \} \cup \{ \frac{\ell}{n(1 + \varepsilon)} \in (a, b) : \ell \in \mathbb{Z} \}$. Then, $C(\lambda) \ge
\min\{ C(\alpha), C(\beta) \}$ for any $\lambda \in (\alpha, \beta)$.

Finally, to show Theorem 2, note that the statement about the coverage probability follows
immediately from Lemma 10. The number of elements of the finite set can be calculated by using
the property of the ceiling and floor functions.